In two-class pattern recognition, it is a standard technique to use
an algorithm that finds a hyperplane separating the two classes
in a linearly separable training set. The traditional methods find a
hyperplane which separates all points in one class from all points in
the other, but such a hyperplane is not necessarily centered in the
empty space between the two classes. Since a central hyperplane
does not favor one class or the other, it should have a lower error
rate in classifying new points and is therefore better than a
noncentral hyperplane. Six algorithms for finding central hyperplanes
are tested on three data sets. Although frequently used in practice,
the modified relaxation algorithm is very poor. Three algorithms,
which are defined in the paper, are found to be quite good.
EXPERIMENTS WITH SOME ALGORITHMS THAT FIND CENTRAL
SOLUTIONS FOR PATTERN CLASSIFICATION


1.  INTRODUCTION 

Linear discriminant functions (hyperplanes) play a very important
part in automatic pattern recognition [1,2,3,4,5,7,8]. This paper
explores a number of improved methods of finding hyperplanes which are
better because they are more nearly central between two different
classes. This is sometimes called the problem of finding central
hyperplanes.

To consider this problem in more detail, suppose we are given a
linearly separable training set of pattern vectors in two classes.
"Linearly separable" means that the two classes can be separated by
some hyperplane. There are many algorithms that will find a solution
hyperplane. However, such methods guarantee only that all points of one
class from the training set will be on one side of the hyperplane and
that all points of the other class will be on the other side. Such
a hyperplane is not necessarily centrally located in the empty
space between the two classes.

In  order  to  understand  why  a central  hyperplane  is  desirable  we 
must  remember  that  the  given  points  (the  training  set)  are  only  a 
sample  of  the  universe  of  points  that  we  want  to  classify.  The  purpose 
of  the  hyperplane  is  to  classify  new  points  (test  vectors)  from  the 
universe.  When  a new  point  is  given,  common  sense  tells  us  to  assign 
it  to  that  class  which  its  nearby  neighbors  belong  to.  Only  a central 
hyperplane  will  do  this. 

Figures  1 and  2 show  an  example  in  two  dimensions.  The  A and  B 
points  are  the  training  set.  Figure  1 shows  a poor  solution  hyperplane 
and  Figure  2 shows  a much  better  (more  central)  solution  hyperplane. 

X represents  a new  point.  Common  sense  tells  us  that  X should  be  in 
class  B,  but  the  poor  hyperplane  puts  it  into  class  A.  The  good 
hyperplane  of  Figure  2 correctly  puts  X into  class  B. 

Of course, the ideal way to solve this kind of problem is to fit
a probability density function to each class by some process of
statistical inference, and then choose the surface of equal probability
density as the decision boundary between the two classes. However, such
a surface is not necessarily flat, and the process of finding such a
surface may be complex and expensive.


Note: Manuscript submitted September 7, 1977.

[Fig. 2 - ... training set as in Fig. 1]

Therefore, finding a central hyperplane is a simple approximation
which is adequate for many practical applications. In fact, the central
hyperplane may often be the optimal solution to a practical pattern
recognition problem, when economic factors are considered.

Slagle [7] discusses the problem of finding a central hyperplane
and presents an algorithm which is the forerunner of algorithm E below.
Yue [8] gives some algorithms which maintain a dead zone of constant
width while finding a solution hyperplane by a relaxation process.
Yue gives a heuristic procedure for trying to guess a suitable width
for the dead zone so that the hyperplane will be approximately centered.
Unfortunately, it is easy to give simple counterexamples to the
heuristic, and, if the heuristic guesses too high, no solution exists
at all.


2.  NOTATION 

We use a development and notation similar to that used by Meisel
[5]. Let the weight vector $W = (w_1, \ldots, w_n, w)$, where
$V = (w_1, \ldots, w_n)$ is the vector which defines the orientation of
the hyperplane ($V$ is perpendicular to the hyperplane).

We shall sometimes write $W = (V, w)$. Let the length of $V$ be $v$,
that is, $\|V\| = v$.

Let $X$ be an (unaugmented) pattern vector $(x_1, \ldots, x_n)$. Let $c(X)$ be
the choice function; that is, if $c(X) = k$, the classifier puts $X$ into
class $k$. We shall sometimes denote class $k$ by $c_k$. Once it is trained,
our classifier puts $X$ into class 1 if $V \cdot X + w > 0$ and into class 2
if $V \cdot X + w < 0$. The solution hyperplane $H$ is given by
$V \cdot X + w = 0$.

It is well known that the distance from the origin to the hyperplane is
$-w/v$. Suppose that we are given a training set $X_1, \ldots, X_m$ of pattern
vectors. The augmented vector corresponding to the pattern vector $X_i$ is

$$Y_i = \begin{cases} (X_i, 1) & \text{if } X_i \text{ is in class 1,} \\ (-X_i, -1) & \text{if } X_i \text{ is in class 2.} \end{cases}$$

Now  let  us  consider  the  dead  zone.  Roughly  speaking,  this  is  the 
zone  surrounding  the  hyperplane  which  contains  no  points  from  the 
training  set. 

Let $b$ be a positive real number. We chose $b = 1$ in our experiments.
$b$ is related to the dead zone width as follows. We want to train our
classifier (that is, find a $W$) so that it puts $X_i$ into class 1 if
$V \cdot X_i + w \ge b$ and into class 2 if $V \cdot X_i + w \le -b$. See Fig. 3.

This can be accomplished by training the classifier so that
$W \cdot Y_i \ge b$ for all $i = 1, \ldots, m$. That is, we want $e_i \ge 0$ where
$e_i = W \cdot Y_i - b$. Recall that $e_i/v$ is the n-dimensional distance from
the pattern $X_i$ to the appropriate hyperplane dead zone boundary. Half
of the dead zone width is $b/v$.
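
As a minimal sketch of this notation (not the authors' program), the following Python computes augmented vectors, the quantities $e_i = W \cdot Y_i - b$, and the half width $b/v$. The helper names, the one-dimensional data, and the value b = 1.0 are assumptions made only for illustration.

```python
import math

def augment(x, cls):
    """Augmented vector: (X, 1) for class 1 and (-X, -1) for class 2."""
    sign = 1.0 if cls == 1 else -1.0
    return [sign * xj for xj in x] + [sign]

def dot(a, b):
    return sum(aj * bj for aj, bj in zip(a, b))

def margins(W, Ys, b):
    """e_i = W . Y_i - b; e_i >= 0 means pattern i is outside the dead zone
    on the correct side."""
    return [dot(W, Y) - b for Y in Ys]

def half_width(W, b):
    """Half of the dead zone width, b / v, with v the length of V."""
    v = math.sqrt(dot(W[:-1], W[:-1]))
    return b / v

# Hypothetical data: X = 2 in class 1 and X = 6 in class 2.
Ys = [augment([2.0], 1), augment([6.0], 2)]
W = [-1.0, 4.0]   # V = -1, w = 4, so the hyperplane is the point X = 4
b = 1.0           # assumed value, for illustration only
e = margins(W, Ys, b)
v = math.sqrt(dot(W[:-1], W[:-1]))
distances = [ei / v for ei in e]   # distance of each pattern to its boundary
```

Here both patterns lie a distance 1 outside their dead zone boundaries (at X = 3 and X = 5), so the dead zone could still be widened.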


3.  BROADENING  THE  DEAD  ZONE  OF  A SOLUTION  HYPERPLANE 


Each of the six algorithms we shall describe uses a broadening
subalgorithm or a centering subalgorithm to try to improve the solution it
finds. The broadening subalgorithm increases, if necessary, the breadth
of the dead zone just enough so that one of its boundaries passes
through the nearest training vector. More formally, let $h = \min_i e_i$.

The boundary $B_1$ and the boundary $B_2$ are moved a distance $h/v$ away
from the dead zone $D$ and towards class 1 and class 2 respectively. The
new half width is $b/v' = b/v + h/v$. Hence, $v' = bv/(b + h) = cv$ where
$c = b/(b + h)$. The distance of the solution hyperplane from the origin
should be preserved. Hence, $-w'/v' = -w/v$. Therefore, $w' = v'w/v = cw$.
We meet all the conditions by setting $W' = cW$.
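
The broadening step reduces to a single scaling of the weight vector. The sketch below is an illustration under assumed names and assumed one-dimensional data (X = 2 in class 1, X = 6 in class 2, b = 1); it is not taken from the original implementation.

```python
def dot(a, b):
    return sum(aj * bj for aj, bj in zip(a, b))

def broaden(W, Ys, b):
    """Scale W by c = b / (b + h), where h = min_i e_i and e_i = W . Y_i - b.

    Ys holds augmented vectors; W is assumed to be a solution (all e_i >= 0).
    """
    h = min(dot(W, Y) - b for Y in Ys)
    c = b / (b + h)
    return [c * wj for wj in W]

# Hypothetical augmented training set and a non-central solution at X = 3.5.
Ys = [[2.0, 1.0], [-6.0, -1.0]]
W = [-1.0, 3.5]
b = 1.0
W_broad = broaden(W, Ys, b)
new_margins = [dot(W_broad, Y) - b for Y in Ys]
```

After broadening, the smallest new margin is zero (a boundary touches the nearest training vector), while the hyperplane itself stays at X = 3.5 because W is only rescaled.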


4.  CENTERING  AND  OVERCENTERING  A SOLUTION 

Four of the algorithms use the concept of centering the weight
vector $W$, and the sixth algorithm also uses the concept of
overcentering. We now define fg-centering, from which centering and
overcentering are defined later. Intuitively, fg-centering of $W$
moves the boundary $B_1$ a distance $f/v$ away from the dead zone $D$ and
moves the boundary $B_2$ a distance $g/v$ away from $D$. Let $v'$ be the new
$v$ after the fg-centering. The new half width of the dead zone is
$b/v' = b/v + f/(2v) + g/(2v)$. Hence, $v' = 2bv/(2b + f + g) = cv$ where
$c = 2b/(2b + f + g)$. Therefore, $V' = cV$. The new distance of the
solution hyperplane $H$ from the origin is
$-w'/v' = \frac{1}{2}(-w/v + f/v - w/v - g/v)$.
Hence, $w' = (2w + g - f)c/2$. Therefore, the fg-centering
of $W$ is the computation of a new $W$, namely, $(V', w')$.

Centering is defined as fg-centering where $f = \min_{X_i \in c_1} e_i$ and
$g = \min_{X_i \in c_2} e_i$. Intuitively, centering expands the dead zone as much
as possible until it bumps into a training point (vector) in $c_1$ and a
training point in $c_2$.
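
A small sketch of fg-centering and centering follows. The function names and the sample data are assumptions for illustration; the formulas are the ones derived in this section, $V' = cV$ and $w' = (2w + g - f)c/2$ with $c = 2b/(2b + f + g)$.

```python
def fg_center(W, b, f, g):
    """Return (V', w') with V' = cV and w' = (2w + g - f)c/2."""
    V, w = W[:-1], W[-1]
    c = 2.0 * b / (2.0 * b + f + g)
    return [c * vj for vj in V] + [c * (2.0 * w + g - f) / 2.0]

def margin(W, x, cls, b):
    """e for an unaugmented pattern x in class cls (1 or 2)."""
    s = sum(vj * xj for vj, xj in zip(W[:-1], x)) + W[-1]
    return (s if cls == 1 else -s) - b

def center(W, samples, b):
    """fg-centering with f, g the smallest e in class 1 and class 2."""
    f = min(margin(W, x, cls, b) for x, cls in samples if cls == 1)
    g = min(margin(W, x, cls, b) for x, cls in samples if cls == 2)
    return fg_center(W, b, f, g)

# Hypothetical data: X = 2 in class 1, X = 6 in class 2; solution at X = 3.5.
samples = [([2.0], 1), ([6.0], 2)]
W_centered = center([-1.0, 3.5], samples, 1.0)
dividing_point = -W_centered[1] / W_centered[0]
```

For this data f = 0.5 and g = 1.5, so c = 0.5 and the centered weight vector is (-0.5, 2.0): the hyperplane moves from X = 3.5 to the midpoint X = 4, and the dead zone boundaries land on the two training points.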


5.  THE  ALGORITHMS  TO  BE  TESTED 


The six algorithms to be compared are broadened modified relaxation,
centered modified relaxation, broadened accelerated relaxation, centered
accelerated relaxation, extended centered accelerated relaxation, and
centered overcentered accelerated relaxation. The first two are based
on modified relaxation [2,4,5]. The next two are based on accelerated
relaxation [1]. The last two we introduce here.


A. Broadened Modified Relaxation. This algorithm examines the training
set pattern vectors one at a time. Let $W_k$ be the weight vector
after the $k$th examination. Let $W_0$ be arbitrary, for example,
$W_0 = (0, 0, \ldots, 0)$. Then

$$W_{k+1} = \begin{cases} W_k - \rho (e_k - c) Y_k / s_k & \text{if } e_k < 0, \\ W_k & \text{otherwise,} \end{cases}$$

where $Y_k$ is the augmented training vector being examined,
$e_k = W_k \cdot Y_k - b$, $c > 0$, and $s_k = \|Y_k\|^2$.

The algorithm continues until the training set is linearly separated
or has been examined $P$ times. When a solution is found, it is
broadened. This algorithm (without the broadening) is often used in
practice, because it can be proved [2,4,5] that, if the two classes
in the training set are linearly separable, the algorithm will find a
solution in a finite number of steps. In our implementation, we set
$P = 100$, $c = 0.1$, and, with the help of advice from [4], $\rho = 1.999$.

If $c$ is dropped from the above equation, this algorithm without the
broadening becomes the (unmodified) relaxation algorithm.
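
The one-at-a-time update can be sketched as below. This is an illustrative reading rather than the original program: the function name, the stopping test (a complete pass with no correction), the default b = 1.0, and the tiny data set are all assumptions.

```python
def dot(a, b):
    return sum(aj * bj for aj, bj in zip(a, b))

def modified_relaxation(Ys, b=1.0, rho=1.999, c=0.1, max_passes=100):
    """One-at-a-time modified relaxation on augmented vectors Ys.

    Returns (W, solved); solved is True when a complete pass over the
    training set needed no correction (assumed stopping rule).
    """
    W = [0.0] * len(Ys[0])            # arbitrary start, here the zero vector
    for _ in range(max_passes):
        corrected = False
        for Y in Ys:
            e = dot(W, Y) - b
            if e < 0:
                step = rho * (e - c) / dot(Y, Y)
                W = [wj - step * yj for wj, yj in zip(W, Y)]
                corrected = True
        if not corrected:
            return W, True
    return W, False

# Hypothetical data: X = 2 in class 1 and X = -2 in class 2 (augmented form).
Ys = [[2.0, 1.0], [2.0, -1.0]]
W, solved = modified_relaxation(Ys)
```

Dropping `c` (setting it to zero) in this sketch gives the unmodified relaxation update. The solution returned would still have to be broadened or centered as described in Sections 3 and 4.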

B. Centered Modified Relaxation. This is the same as broadened
modified relaxation, except that the solution is centered at the end
instead of broadened.

C. Broadened Accelerated Relaxation. The accelerated relaxation
algorithm due to C. L. Chang [1] has two parts, the relaxation phase
and the acceleration phase. These two parts are repeated alternately.
The process begins with an arbitrary initial weight vector $W^a_0$. After
one complete pass through the set of all training vectors, $W^a_0$ becomes
$W^r_1$; in general, $W^a_k$ becomes $W^r_{k+1}$. The relaxation process of
algorithm A is used during this pass. $W^r$ is the result of the relaxation
process and $W^a$ is the result of the acceleration process.

During the acceleration phase $W^r_{k+1}$ becomes $W^a_{k+1}$. This is
done by constructing a line from the point $W^a_k$ to $W^r_{k+1}$ and then
extending this line beyond $W^r_{k+1}$. A search is then made along this
extension for the interval of best separation score. $W^a_{k+1}$ is the
center of the interval. The procedure is repeated up to 30 times or
until it converges (which means that the training set is linearly
separated). In general, accelerated relaxation is much faster than
modified relaxation [1]. Our algorithm broadens the solution found by
the accelerated relaxation algorithm.

D.  Centered  Accelerated  Relaxation.  This  is  the  same  as  the  broadened 
accelerated  relaxation  algorithm,  except  that  the  solution  is  centered 
at  the  end  instead  of  broadened. 

E. Extended Centered Accelerated Relaxation. The resulting weight
vector from centered accelerated relaxation (algorithm D) is given as
the initial weight vector to the modified extended relaxation algorithm
given by equation (3) below with $c = 0.1$, $\rho = 1.999$, and the number $P$
of passes through the training set being ten. The weight vector obtained
is then centered. Thus, the major part of this algorithm is the modified
extended relaxation algorithm, which is described in the rest of this
section.

Before describing the modified extended relaxation algorithm precisely,
we discuss the intuitive idea upon which it is based. The relaxation
algorithm (see algorithm A above) is frequently used in practice. It
obtains a discriminating hyperplane by trying to minimize the
(approximate) relaxation risk $R$, where $R$ is the relaxation loss
averaged over the training set. We now present a very simple pattern
recognition training set in order to illustrate the advantage of our
approach (extended relaxation) over the relaxation approach. Consider
the two-class, two-pattern, one-dimensional pattern recognition training
set shown in Fig. 4. It consists of two patterns (points) $X_1 = 2$ and
$X_2 = 6$ with $X_1$ in class 1 and $X_2$ in class 2. Intuitively, the best
point to divide class 1 from class 2 is $X = 4$, the midpoint of the
interval from $X_1$ to $X_2$. The classifier should classify a pattern
from the test set into class 1 if it lies to the left of $X = 4$ and
into class 2 if to the right. It turns out that our approach yields
$X = 4$, whereas the relaxation approach may choose the dividing point
$X$ anywhere in the open interval (2,6). Our particular implementation
of relaxation (algorithm A) chooses $X = 3.0$, which is poor.

For the simple training set we are considering, Fig. 5 shows the
part of the relaxation loss (for the training set) due to the position
of the left boundary $B_1$ of the dead zone. We are considering the
simple case when the loss is taken to be proportional to the square
of the distance between $B_1$ and $X_1$, when $X_1$ is on the wrong side of $B_1$;
the loss is zero when $X_1$ is on the correct side of $B_1$. Fig. 6
shows the part of the loss due to the position of the right boundary
$B_2$ of the dead zone. It is clear that the total loss is zero as long
as both dead zone boundaries are in the closed interval [2,6].
Therefore, the dividing point $X$ may be any point in the open interval (2,6).

In our new (extended relaxation) approach, we use a loss function
like the one shown in Fig. 7 and Fig. 8. Instead of being zero when
$X_1$ is on the correct side of $B_1$, the loss is a small positive fraction
of the square of the distance between $B_1$ and $X_1$. This and the similar
fact about $X_2$ and $B_2$ account for the two gently sloping semiparabolas
in Fig. 7 and Fig. 8 respectively. It is clear that the losses are
minimized (zero in fact) only when the dead zone boundaries are
$B_1 = 2$ and $B_2 = 6$. This yields the central dividing point $X = 4$ and
its advantages. We next generalize these ideas to more points and
higher dimensions.
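
The one-dimensional example can be checked numerically. The sketch below is only an illustration: the proportionality constant is taken as 1, the small fraction is taken as t = 0.1, and the boundary positions are searched on a coarse grid, all of which are assumptions not stated in the text.

```python
X1, X2 = 2.0, 6.0   # the two training points of the example

def loss_left(B1, t):
    d = B1 - X1                 # d >= 0: X1 is on the correct side of B1
    return t * d * d if d >= 0 else d * d

def loss_right(B2, t):
    d = X2 - B2                 # d >= 0: X2 is on the correct side of B2
    return t * d * d if d >= 0 else d * d

def total_loss(pair, t):
    return loss_left(pair[0], t) + loss_right(pair[1], t)

def minimizers(t, grid):
    """All boundary pairs (B1, B2) with B1 <= B2 that attain the least loss."""
    pairs = [(b1, b2) for b1 in grid for b2 in grid if b1 <= b2]
    best = min(total_loss(p, t) for p in pairs)
    return [p for p in pairs if total_loss(p, t) == best]

grid = [0.5 * i for i in range(17)]      # 0.0, 0.5, ..., 8.0
relaxation = minimizers(0.0, grid)       # t = 0: ordinary relaxation loss
extended = minimizers(0.1, grid)         # t > 0: extended relaxation loss
dividing_point = (extended[0][0] + extended[0][1]) / 2.0
```

With t = 0 many boundary pairs inside [2,6] tie at zero loss, whereas with t > 0 the only minimizer on the grid is (2, 6), whose midpoint is the central dividing point 4.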

We now define the extended relaxation loss and derive the
(approximate) extended relaxation risk from this loss. We shall then
derive the extended relaxation algorithm as an algorithm that works on
minimizing this risk. Finally, we modify the extended relaxation
algorithm to obtain the modified extended relaxation algorithm. Let
$L[j,k]$ be the loss associated with classifying a pattern into class $j$
when it is actually in class $k$. Let $s = \|Y\|^2 = 1 + \|X\|^2$. The defining
equation for the extended relaxation loss is the following, where
$k = 1, 2$.

$$L[c(X), k] = \begin{cases} t e^2 / s & \text{if } e \ge 0, \\ e^2 / s & \text{otherwise.} \end{cases}$$

When $t$ is zero this degenerates to the relaxation loss. We use the
following expression for $t$, which empirically we have found to be
reasonably good: $t = 0.73/(1 + m)$, where $m$ is the number of training
samples. The (approximate) risk is the average loss over the training
set as follows.

$$R(W) = \frac{1}{m} \Big[\, t \sum_{e_i \ge 0} e_i^2 / s_i + \sum_{e_i < 0} e_i^2 / s_i \Big] \tag{1}$$
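
As an illustration of the risk in equation (1), the following sketch evaluates it for augmented vectors. The function names and the two-point data are assumptions; the expression for t is the empirical one quoted above.

```python
def dot(a, b):
    return sum(aj * bj for aj, bj in zip(a, b))

def default_t(m):
    """Empirical weight t = 0.73 / (1 + m) for m training samples."""
    return 0.73 / (1 + m)

def extended_risk(W, Ys, b, t):
    """Average extended relaxation loss over the augmented vectors Ys."""
    total = 0.0
    for Y in Ys:
        e = dot(W, Y) - b
        s = dot(Y, Y)                       # s = ||Y||^2
        total += (t if e >= 0 else 1.0) * e * e / s
    return total / len(Ys)

# Hypothetical data: X = 2 in class 1, X = 6 in class 2, with b = 1.
Ys = [[2.0, 1.0], [-6.0, -1.0]]
t = default_t(len(Ys))
risk_central = extended_risk([-0.5, 2.0], Ys, 1.0, t)   # boundaries at 2 and 6
risk_narrow = extended_risk([-1.0, 4.0], Ys, 1.0, t)    # boundaries at 3 and 5
```

Both weight vectors separate the data, so the ordinary relaxation risk (t = 0) cannot tell them apart; with t > 0 the narrower dead zone is charged a small positive risk.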


We shall now use the usual gradient technique to derive an algorithm
that works on minimizing the extended relaxation risk. Let $W_k$ be the
weight vector at iteration $k$. From $W_{k+1} = W_k - q \, \mathrm{grad}\, R(W_k)$ we have

$$W_{k+1} = W_k - (2q/m) \Big[\, t \sum_{e_i \ge 0} e_i Y_i / s_i + \sum_{e_i < 0} e_i Y_i / s_i \Big] \tag{2}$$

Let $\rho = 2q/m$. When $t$ is zero this degenerates to the many-at-a-time
relaxation algorithm. In the many-at-a-time modified relaxation
algorithm, a positive constant $c$ is subtracted from $e_i$ to guarantee
finite convergence in the linearly separable case. We do the analogous
thing to (2) to obtain the many-at-a-time modified extended relaxation
algorithm. The positive constant $c$ is added to $e_i$ when $e_i \ge 0$.

$$W_{k+1} = W_k - \rho \Big[\, t \sum_{e_i \ge 0} (e_i + c) Y_i / s_i + \sum_{e_i < 0} (e_i - c) Y_i / s_i \Big]$$

This algorithm can be changed in the usual way to the following
one-at-a-time algorithm.

$$W_{k+1} = \begin{cases} W_k - \rho t (e_k + c) Y_k / s_k & \text{if } e_k \ge 0, \\ W_k - \rho (e_k - c) Y_k / s_k & \text{otherwise.} \end{cases} \tag{3}$$

This  is  our  modified  extended  relaxation  algorithm.  It  is  the  major 
component  of  our  extended  centered  accelerated  relaxation  algorithm, 
as  explained  at  the  beginning  of  this  section. 
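
A sketch of the one-at-a-time modified extended relaxation update is given below, under the assumption that a pattern with $e \ge 0$ is corrected with weight $t$ and the constant $c$ added, and a pattern with $e < 0$ is corrected as in algorithm A. The function name, default arguments, and example data are illustrative assumptions.

```python
def dot(a, b):
    return sum(aj * bj for aj, bj in zip(a, b))

def modified_extended_relaxation(Ys, W0, b=1.0, rho=1.999, c=0.1,
                                 passes=10, t=None):
    """Run a fixed number of passes of the one-at-a-time update.

    Ys are augmented vectors, W0 the initial weight vector (in algorithm E
    this would be the output of centered accelerated relaxation).
    """
    if t is None:
        t = 0.73 / (1 + len(Ys))          # empirical choice quoted in the text
    W = list(W0)
    for _ in range(passes):
        for Y in Ys:
            e = dot(W, Y) - b
            s = dot(Y, Y)
            if e >= 0:
                coeff = rho * t * (e + c) / s
            else:
                coeff = rho * (e - c) / s
            W = [wj - coeff * yj for wj, yj in zip(W, Y)]
    return W

# Hypothetical use: two augmented patterns and an assumed starting solution.
Ys = [[2.0, 1.0], [-6.0, -1.0]]
W = modified_extended_relaxation(Ys, [-1.0, 4.0])
```

In algorithm E the vector returned here would then be centered, since individual updates can leave a pattern slightly inside the dead zone.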

F. Centered Overcentered Accelerated Relaxation. This algorithm
carries out algorithm D (centered accelerated relaxation), overcenters
the solution (that is, expands the dead zone a little more than
centering would), carries out algorithm D again, overcenters again,
etc. We now define overcentering. Let $f' = \min_{X_i \in c_1} e_i$. Let $f''$ be the
next strictly larger such minimum. Let $g' = \min_{X_i \in c_2} e_i$. Let $g''$ be
the next strictly larger such minimum. Suppose that $f'' - f' \ge g'' - g'$.
(The other case is defined analogously.) Overcentering is defined as
$f'g''$-centering.
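
Overcentering can be sketched as follows. The treatment of the "other case" (using the second-smallest value in class 1 instead) and the fallback to plain centering when a class has only one distinct value are assumptions, as are the function names and the sample data.

```python
def fg_center(W, b, f, g):
    c = 2.0 * b / (2.0 * b + f + g)
    return [c * vj for vj in W[:-1]] + [c * (2.0 * W[-1] + g - f) / 2.0]

def margin(W, x, cls, b):
    s = sum(vj * xj for vj, xj in zip(W[:-1], x)) + W[-1]
    return (s if cls == 1 else -s) - b

def two_smallest(values):
    """Smallest value and the next strictly larger one (None if absent)."""
    distinct = sorted(set(values))
    return distinct[0], (distinct[1] if len(distinct) > 1 else None)

def overcenter(W, samples, b):
    f1, f2 = two_smallest([margin(W, x, k, b) for x, k in samples if k == 1])
    g1, g2 = two_smallest([margin(W, x, k, b) for x, k in samples if k == 2])
    gap_f = f2 - f1 if f2 is not None else float("inf")
    gap_g = g2 - g1 if g2 is not None else float("inf")
    if gap_f >= gap_g and g2 is not None:
        return fg_center(W, b, f1, g2)    # case stated in the text
    if f2 is not None:
        return fg_center(W, b, f2, g1)    # assumed analogous case
    return fg_center(W, b, f1, g1)        # fallback: ordinary centering

# Hypothetical data: class 1 at X = 1, 2 and class 2 at X = 6, 6.5,
# starting from the centered solution with boundaries at X = 2 and X = 6.
samples = [([1.0], 1), ([2.0], 1), ([6.0], 2), ([6.5], 2)]
W_over = overcenter([-0.5, 2.0], samples, 1.0)
B1 = (1.0 - W_over[1]) / W_over[0]      # where V'X + w' = b
B2 = (-1.0 - W_over[1]) / W_over[0]     # where V'X + w' = -b
```

In this example the class 2 boundary moves past the nearest class 2 point to the next one at X = 6.5, while the class 1 boundary stays at X = 2, so the dead zone is slightly wider than centering alone would make it.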

The centered overcentered accelerated relaxation algorithm uses
the accelerated relaxation algorithm a total of kmax times, where
kmax is typically five. The algorithm is the following:

(1) Set kmax.

(2) Set $W''_0 = (0, 0, \ldots, 0)$.

(3) Set k = 1.

(4) Starting with $W''_{k-1}$, let the solution found by the
accelerated relaxation algorithm be $W_k$.

(5) Obtain $W'_k$ by centering $W_k$.

(6) Type the centrality criteria $C_1$ and $C_2$ for $W'_k$. (See
Section 6.)

(7) If k = kmax, type the best results (the results for the
best $C_1$, and the results for the best $C_2$) and stop.

(8) Obtain $W''_k$ by overcentering $W'_k$.

(9) Set k = k + 1.

(10) Go to (4).

6.  CENTRALITY  CRITERIA 

The concept of the central hyperplane has never been precisely
defined. It remains an intuitive notion. Nevertheless, we need a
definite measure of centrality in order to compare the various
algorithms. We compromise by giving two distinct quantitative
measures of centrality, but we do not specify the relative importance
of these criteria.


Suppose that we have a solution hyperplane $W \cdot Y - b = 0$. We
assume that the solution has been broadened or centered. Our first
centrality criterion is half the dead zone width, $C_1(W) = b/v$. The
larger this criterion is, the more central the solution hyperplane
$H$ tends to be.

We now obtain our second centrality criterion from Equation (1)
for the extended relaxation risk. Recall that we apply a criterion
only if the solution with dead zone correctly classifies every vector
in the training set, that is, only if $e_i \ge 0$ for all $i$. We obtain
$C_2(W) = t \sum_i e_i^2 / s_i$. The smaller this criterion is, the more central
the solution hyperplane tends to be.
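
The two criteria can be computed together, as in the sketch below. It reads the second criterion as $t \sum_i e_i^2 / s_i$ over augmented vectors; the function name, the error raised for a non-solution, and the example data are assumptions for illustration.

```python
import math

def dot(a, b):
    return sum(aj * bj for aj, bj in zip(a, b))

def centrality_criteria(W, Ys, b, t):
    """Return (C1, C2) for a solution W; larger C1 and smaller C2 are better."""
    es = [dot(W, Y) - b for Y in Ys]
    if any(e < 0 for e in es):
        raise ValueError("criteria apply only when every e_i >= 0")
    v = math.sqrt(dot(W[:-1], W[:-1]))
    c1 = b / v                                        # half the dead zone width
    c2 = t * sum(e * e / dot(Y, Y) for e, Y in zip(es, Ys))
    return c1, c2

# Hypothetical data: X = 2 in class 1, X = 6 in class 2, with b = 1.
Ys = [[2.0, 1.0], [-6.0, -1.0]]
t = 0.73 / (1 + len(Ys))
central = centrality_criteria([-0.5, 2.0], Ys, 1.0, t)
narrow = centrality_criteria([-1.0, 4.0], Ys, 1.0, t)
```

On this toy data the centered solution has the larger first criterion and the smaller second criterion, so the two measures agree; in general they need not.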

7. EXPERIMENTS WITH THE ALGORITHMS

Each of the six algorithms was run on each of three data sets.
The iris data consists of four measurements on each of sixty irises,
the first twenty from each of three classes. This is a subset of the
famous set treated by R. A. Fisher [3]. The Fig. 9 data consists of
the artificial set shown in Fig. 9. This set was chosen because it is
simple to consider and yet it is difficult for many algorithms to get
an optimal solution according to our criteria. The JU data consists
of eight measurements on each of thirty hand printed J's and thirty
hand printed U's. This set is described and used in [6].

The results for all three data sets are summarized in Table 2.
Percentages are averaged over the three data sets. That is, each
entry is a third of the sum of the three percent entries taken from
Table 1 and the two tables for the iris and JU data sets.

8.  CONCLUSIONS 

The  conclusions  are  based  on  experiments  with  three  specific  data 
sets  and  therefore  may  not  be  true  in  general.  However,  they  do  not  go 
against  intuition. 

(1) The last three algorithms are substantially better than the
first three. If computer time is available, one should try algorithms
D, E, and F and take the best result obtained.

(2) If computer time is short, one should use algorithm D, which
is fast and gives a central, though not the most central, solution.

(3) Although slower than algorithm D, algorithms E and F find the
most central solutions.

(4) Centering improves the solution tremendously and takes very
little time. This can be seen by comparing algorithms A and B and by
comparing algorithms C and D in Table 2.

(5)  Although  frequently  used,  the  modified  relaxation  algorithm  is 
particularly  bad.  It  is  slow  and  finds  solutions  that  are  not  at  all 
central. 

(6)  The  broadened  accelerated  relaxation  algorithm  is  the  fastest, 
but  the  solutions  are  not  at  all  central. 

Fig. 9 - Artificial data set. The members of Class A are
(-5, 30), (0, 30), (5, 30), (10, 30), and (15, 30). The members
of Class B are (-20, 10), (-15, 10), (-10, 10), (-5, 10),
(0, 10), and (5, 10).


Table 1. Results for the Fig. 9 Data. Averages are Taken
Over the Six Algorithms.

Kind of Relaxation Algorithm          Avg. % for   Avg. % for   Avg. % for
                                      time         C1           C2

Broadened Modified                    110          75           212.1
Centered Modified                     118          104          83.9
Broadened Accelerated                 63           54           272.9
Centered Accelerated                  71           118          16.0
Extended Centered Accelerated         114          123          6.3
Centered Overcentered Accelerated     125          126          8.8


Table 2. Summary of Results. Percentages are
Averaged Over the Three Data Sets.